AH  A 137  229  DESIGN  Of  EXPERIMENTS  IN  SIMUIAT JON(U>  DESMATICS  INC 
STATE  COLLEGE  PA  D  E  SMITH  ET  AL .  JAN  04  TR  -  1  1 3-  1 3 
N000 14  -  79 -C  *  0650 


tlNCl  ASSIF  IED 


F/G9/2 


J 


DESMATICS,  INC. 


P.  O.  Box  618 

State  Collaga,  PA.  16801 

Phone:  (814)  238-9621 


Applied  Research  in  Statistics  -  Mathematics  -  Operations  Research 


DESIGN  OF  EXPERIMENTS  IN  SIMULATION 


by 

Dennis  E.  Smich 
and 

Carl  A.  Mauro 


To  appear  as  invited  paper  for  special  issue  on  simulation 
of  American  Journal  of  Mathematical  and  Management  Sciences 


TECHNICAL  REPORT  NO.  113-13 
January  1986 


This  study  was  supported  by  the  Office  of  Naval  Research 
under  Contract  No.  N00014-79-C-0650,  Task  No.  NR  062-667 

Reproduction  in  whole  or  in  part  is  permitted 
for  any  purpose  of  the  United  States  Government 

Approved  for  public  release;  distribution  unlimited 


DTIC 

SELECTE 

JAN  2  6  1984 


D 


TABLE  OF  CONTENTS 


1.  INTRODUCTION  . 

2.  FACTOR  SCREENING  . 

3.  INVESTIGATING  THE  FUNCTIONAL  RELATIONSHIP 

4.  OPTIMUM-SEEKING . 

5.  APPLICATION  OF  RSM  . 

6.  VARIANCE  REDUCTION  TECHNIQUES . 

7.  REFERENCES  . 


i- 


1.  INTRODUCTION 


In  many  research  areas  computer  simulation  is  a  very  useful  and 
powerful  technique  for  studying  the  behavior  of  complex  real-world  sys¬ 
tems.  Of  course,  there  a  number  of  Important  steps  to  a  successful  simu¬ 
lation  study  [see.  for  instance,  Gordon  (1978),  Kleijnen  (1974),  and 
Law  and  Kelton  (1982)].  Perhaps  the  two  most  critical  steps  are  (1) 
model  validation,  and  (2)  the  design  and  analysis  of  the  simulation  ex¬ 
periments.  Careful  attention  to  both  of  these  steps  is  necessary  for 
a  meaningful  and  sound  simulation  study.  For  instance,  if  a  simulation 
model  is  not  sufficiently  representative  of  the  system  under  study  (i.e., 
valid),  the  output  data  may  be  misleading  and  result  in  erroneous  con¬ 
clusions  about  the  system.  On  the  other  hand,  even  if  the  model  satis¬ 
factorily  mimics  the  system,  an  optimal  experimental  design  is  needed 
in  order  to  derive  maximum  benefit  from  the  time  and  cost  incurred. 

As  in  any  experimental  investigation,  a  simulation  study  requires 
careful  planning,  data  collection,  output  data  analysis,  and  proper  in¬ 
terpretation  of  the  experimental  data.  In  general,  the  problem  of  de¬ 
sign  is  to  ensure  that  data  relevant  to  the  proposed  study  is  obtained 
in  as  efficient  a  manner  as  possible.  We  are  primarily  concerned  in 
this  paper  with  the  problem  of  experimental  design;  we  will  assume  that 
the  simulation  model  is  an  adequate  representation  of  the  system  under 
consideration.  [General  discussions  of  simulation  validation  are  given 
by  Gass  (1977),  Law  and  Kelton  (1982),  Naylor  and  Burdick  (1975),  Schel- 
lenberger  (1974),  and  Van  Horn  (1971).]  We  will  also  assume  that  the 
computer  program  (code)  used  to  execute  the  simulation  model  has  been 


properly  debugged  so  that  the  system  model  functions  as  Intended. 

There  are  several  characteristics  that  distinguish  simulation 
experimentation  from  statistical  experimentation  in  general.  First, 
we  have  much  more  control  over  the  experimental  conditions  than 
we  do  in  the  real  world.  This  often  allows  us  to  use  the  simulation 
model  to  examine  a  number  of  "what  if"  questions  about  which  little  or 
no  data  currently  exists.  For  example,  using  a  simulation  model  of  a 
nuclear  power  plant,  we  may  wish  to  determine  what  would  happen  in  the 
event  of  a  loss-of-coolant  accident.  Second,  we  can  control  much 
of  the  underlying  randomness  in  a  simulation  by  controlling  the  streams 
of  pseudorandom  numbers  that  drive  and  determine  the  stochastic  events 
that  occur  in  a  simulation.  This  capability  often  allows  us  to 
use  variance-reduction  techniques  to  obtain  estimators  having  greater 
statistical  precision.  Moreover,  there  is  generally  no  need  for  random¬ 
ization  of  experimental  conditions  and  run  order  to  guard  against  the 
inadvertent  introduction  of  systematic  biases  and  variation.  Such  pro¬ 
tection  is  usually  provided  by  the  random-number  generators  already 
present  in  the  simulation  model.  Third,  many  simulation  developers 
attempt  to  take  into  account  as  many  detailed  aspects  of  the  system  un¬ 
der  investigation  as  possible.  As  a  result,  many  simulation  studies  are 
characterized  by  the  inclusion  of  an  exceptionally  large  number  of  in¬ 
put  variables.  Finally,  the  problems  of  missing  data  and  outliers  which 
can  handicap  and  reduce  the  effectiveness  of  any  experimental  investi¬ 
gation  are  generally  of  no  concern  in  simulation  studies.  Outliers 
(l.e.,  discordant  or  contaminant  observations)  cannot  arise  because  a 
simulation  model  is  essentially  a  "closed"  system;  missing  output  data 


can  occur  only  if  the  time  and/or  funds  allocated  for  experimentation 
are  insufficient. 

It  is  beyond  the  scope  of  this  paper  to  consider  all  the  many  facets 
of  experimental  design;  the  current  literature  in  this  subject  area  is 
vaat.  Instead  we  choose  to  discuss  the  salient  aspects  of  four  selected 
topics  which  we  feel  are  of  particular  interest  and  relevance  in  the  simu¬ 
lation  context.  These  are:  (1)  Identification  of  the  important  factors 
(i.e.,  input  variables);  (2)  Investigation  of  the  statistical  relation¬ 
ship  between  the  output  and  input  variables;  (3)  Determination  of  the 
combination  of  factor  levels  for  which  the  response  (i.e.,  output  vari¬ 
able)  is  optimized;  and  (4)  The  use  of  variance  reduction  techniques. 

We  address  each  of  these  topics  in  the  ensuing  sections.  Through¬ 
out  our  discussions  we  assume  that  there  is  but  a  single  response  vari¬ 
able,  and  we  restrict  ourselves  to  the  situation  in  Which  all  the  fac¬ 
tors  are  quantitative. 


2.  FACTOR  SCREENING 


As  noted  previously,  simulation  models  often  involve  a  great  many 
factors.  Such  models,  however,  because  of  their  size  and  running  time, 
can  require  a  prohibitively  large  and  costly  experimental  program  to 
study  their  behavior.  Therefore  we  may  want  to  concentrate  our  analysis 
on  the  set  of  "most  important"  factors.  Factor  screening  methods  [see, 
for  instance,  Kleijnen  (1975),  Montgomery  (1979),  and  Smith  and  Mauro 
(1982)]  are  statistical  methods  that  attempt  to  identify,  efficiently 
and  economically,  a  set  of  most  important  factors.  Once  the  most  im- 
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portant  factors  have  been  identified,  subsequent  simulation  experi¬ 
mentation  can  concentrate  more  intensively  on  these  critical  factors, 
thereby  eliminating  experimentation  with  relatively  unimportant  fac¬ 
tors  which  can  needlessly  consume  resources.  Screening  experiments, 
then,  are  not  usually  an  end  in  themselves  but  are  customarily  per¬ 
formed  as  a  preliminary  step  in  the  experimentation  process. 

In  screening  experiments,  we  want  (a)  to  detect  as  many  important 
factors  as  possible,  (b)  to  declare  important  as  few  unimportant  fac¬ 
tors  as  possible,  and  (c)  to  expend  as  few  runs  as  possible.  Thus, 
one  must  generally  consider  both  how  many  runs  a  screening  strategy  re¬ 
quires  and  how  accurately  it  identifies  factors.  Although  one  may  wish 
to  obtain  finer  factor  groupings  than  simply  "important"  or  "unimportant", 
to  effectively  accomplish  this  would  most  certainly  require  more  screening 
runs  than  are  normally  reasonable  or  affordable.  In  any  event,  the 
greater  (lesser)  the  degree  of  importance  a  factor  has,  the  larger 
(smaller)  should  be  the  probability  of  classifying  that  factor  as  im¬ 
portant. 

In  screening  designs,  a  relatively  small  number  of  factor  levels 
is  generally  employed;  in  fact,  most  screening  experiments  are  two-level 
experiments.  There  are  two  reasons  for  this.  First,  two  levels  of  each 
factor  are  usually  sufficient  to  detect  which  factors  have  major  effects. 
Second,  and  more  importantly,  two-level  designs  maximize  the  number  of 
factors  that  can  be  examined  in  a  given  number  of  runs  because  the  num¬ 
ber  of  factor  level  combinations  is  minimized  when  each  factor  has  only 
two  levels. 

The  full  statistical  model  for  a  two-level  complete  factorial  ex- 
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k 

per linen t  for  k  factors  contains  2  terms:  a  mean  effect,  k  main  ef¬ 
fects,  |j|  two-factor  Interaction  effects,  |^J  three-factor  interaction 
effects . a  k-f actor  interaction  effect.  To  estimate  every  effect 

in  the  full  model,  one  must  run  the  complete  factorial  experiment  con- 
k 

sisting  of  N  -  2  runs.  This  many  runs,  however,  is  rarely  practical 
in  simulation  experimentation;  for  even  a  moderate  number  of  factors 
the  implications  in  terms  of  money  invested  and  overall  run  time  can  be 
quite  overwhelming.  However,  if  we  can  reasonably  assume  that  certain 
higher  order  interactions  are  negligible,  we  can  make  a  less  than  com- 
plete  investigation  by  running  only  a  fraction  of  the  2  treatment  com¬ 
binations. 

In  this  section,  we  will  consider  two  basic  situations:  (1)  the 
unsaturated/saturated  case,  and  (2)  the  supersaturated  case.  In  the 
first  case,  one  can  afford  to  invest  more  runs  than  there  are  factors, 
but  still  considerably  less  than  2  ;  in  the  second  case,  the  number  of 
runs  available  for  screening  is  less  than  or  equal  to  the  number  of  fac¬ 
tors  to  be  screened.  For  the  remainder  of  this  section  we  present  and 
discuss  those  experimental  plans  which  are  particularly  suited  for 
screening  in  these  two  cases.  Before  proceeding  with  the  main  dis¬ 
cussion,  however,  we  digress  momentarily  to  review  a  few  fundamental 
terms  and  concepts  in  design  theory. 

2.1  Orthogonality,  Confounding,  and  Resolution 

The  levels  to  be  run  in  a  two-level  screening  experiment  can  be 
conveniently  displayed  in  a  design  matrix  such  as  is  given  in  Table  1. 
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We  have  arbitrarily  coded  the  two  levels  of  each  factor  as  +1  (high) 
and  -1  (low).  In  run  #1,  for  example,  all  factors  except  are  held 
at  their  high  level. 

When  ±1  coding  is  used,  we  call  two  design  columns  orthogonal  if 
the  sum  of  their  cross  products  is  zero.  Equivalently,  two  columns 
are  orthogonal  if  their  factor  levels  are  balanced,  i.e.,  are  different 
just  as  often  as  they  are  the  same.  Orthogonality  is  a  desirable  de¬ 
sign  property  because  estimates  of  the  (main)  effects  of  orthogonal 
factors  are  independent.  In  other  words,  if  one  of  two  orthogonal  fac¬ 
tors  has  an  effect,  it  cannot  cause  the  other,  perhaps  erroneously,  to 
appear  to  have  an  effect.  For  the  design  matrix  in  Table  1,  x^  and  x^» 
x2  and  x5>  and  and  x^  are  orthogonal. 

If  two  design  columns  are  not  orthogonal,  we  call  the  corresponding 
factors  confounded.  When  two  factors  are  confounded,  it  is  impossible 
to  statistically  separate  their  effects.  In  the  extreme  case,  two  fac¬ 
tors  are  completely  confounded  if  their  design  columns  are  identical  or 
are  reflections  of  one  another.  For  example,  in  Table  1,  Xj  and  x^  are 
completely  confounded.  Factors  Xj  and  x2»  on  the  other  hand,  although 
not  completely  confounded,  are  not  orthogonal  either.  In  this  case,  we 
say  that  they  are  partially  confounded. 

The  type  of  confounding  that  a  design  possesses  is  known  as  its 
resolution.  In  a  design  of  resolution  R,  a  p-factor  interaction  is  un¬ 
confounded  with  any  other  effect  containing  leas  than  R-p  factors.  For 
example,  in  a  resolution  III  design,  main  effects  are  not  confounded  with 
other  main  effects;  in  a  resolution  IV  design,  main  effects  are  not  con- 
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founded  with  other  main  effects  or  two-factor  interactions.  The  prin¬ 
cipal  implication  of  a  resolution  R  design  is  that  p-factor  interactions 
(p  <  R/2)  are  estimable  under  the  assumption  that  all  interactions  of 
order  R-p  or  more  are  negligible. 

The  resolution  of  a  design  is  often  restricted  by  the  number  of 
runs  that  can  be  made.  For  instance,  in  order  for  all  columns  in  a  de¬ 
sign  matrix  to  be  mutually  orthogonal,  the  number  of  runs  must  exce- 
the  number  of  factors.  Consequently,  we  can  obtain  unconfounded  es\ 
mates  of  main  effects  only  in  the  unsaturated/saturated  case.  It  f 
lows  that  in  the  supersaturated  case,  design  resolution  must  be  less  .nan 
R  ■  III,  i.e.,  we  cannot  avoid  confounding  main  effects  in  some  manner. 

2.2  The  Unsaturated/Saturated  Case 

We  now  present  two  types  of  designs  that  are  especially  useful  in 
the  unsaturated /saturated  case.  These  are  Plackett-Burman  (PB)  designs 
and  resolution  IV  foldover  designs. 


2.2.1  Plackett-Burman  Designs 


PB  designs  are  specially  constructed  two-level  minimal  resolution 
III  designs  for  studying  up  to  k  ■  4m- 1  factors  in  N  »  4m  runs.  PB  de¬ 
signs,  therefore,  are  only  available  for  numbers  of  runs  that  are  mul¬ 
tiples  of  four.  Assuming  that  all  Interactions  can  be  ignored,  unbiased 
estimation  of  the  k  main  effects  is  possible  in  a  PB  design.  The  arrange¬ 
ments  for  these  designs  were  derived  by  Plackett  and  Burman  <1946);  see 
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also  Raghavarao  (1971).  It  can  be  noted  that  when  N  is  a  power  of  two, 

PB  designs  are  the  same  as  the  well-known  resolution  III  2  *  fractional 

factorial  designs,  which  are  discussed  in  detail  by  Box  and  Hunter  (1961). 

To  analyze  PB  designs  one  can  use  standard  analysis  of  variance 
methods  and  conduct  formal  significance  testing.  A  useful  alternative 
approach  is  to  plot  the  estimated  effects  on  normal  probability  paper. 

In  this  technique,  due  to  Daniel  (1959),  negligible  effects  should  fall 
approximately  along  a  straight  line,  while  large  effects  should  tend  to 
fall  far  from  the  line.  The  latter  method  of  analysis  is  especially 
helpful  when  the  design  is  saturated  (i.e.,  when  N  *  k-1  and  no  degrees 
of  freedom  are  left  to  estimate  experimental  error)  or  when  only  a  few 
degrees  of  freedom  are  available  for  estimating  experimental  error. 

2.2.2  Resolution  IV  Foldover  Designs 

Resolution  IV  foldover  designs  are  easily  constructed  by  "folding 
over"  a  resolution  III  design,  i.e.,  the  design  matrix  D  can  be  written 
as 

*•[;] 

where  the  matrix  D*  is  a  PB  design  matrix.  Such  designs  have  resolution 
IV  and  allow  us  to  study  up  to  k  factors  in  N  =  2k  runs  where  N  is  a 
multiple  of  eight.  In  these  designs  unbiased  estimates  of  main  effects 
can  be  obtained  even  if  two-factor  interactions  exist. 
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2.2.3  Additional  Remarks 


For  screening  In  the  unsaturated/saturated  case,  resolution  III  and 
IV  designs  usually  suffice.  A  resolution  IV  design,  of  course, 
provides  more  reliable  information  than  a  resolution  III  design  but  re¬ 
quires  twice  as  many  runs.  If  the  simulation  user  is  willing  to  invest 
in  more  than  k  but  less  than  2k  runs,  he  or  she  may  wish  to  consider 
other  possible  main-effects  designs,  such  as  "D-optimal"  designs.  For 
construction  of  D-optimal  designs  we  refer  the  reader  to  the  extensive 
literature  on  these  designs;  see,  for  instance.  Box  and  Draper  (1971), 
Dykstra  (1971),  Mitchell  (1974),  and  St.  John  and  Draper  (1975).  We 
should  remark,  however,  that  PB  designs  are  D-optimal  for  their  number 
of  runs.  Another  interesting  design  optimality  criterion  is  that  of 
"tr (L)-optimality"  for  detecting  the  presence  of  two-factor  interactions. 
These  designs  are  studied  by  Morris  and  Mitchell  (1983). 

2.3  The  Supersaturated  Case 

The  supersaturated  case  arises  when  there  is  a  severe  limitation 
on  the  number  of  runs  available  for  screening.  Such  situations  are  fre¬ 
quently  encountered  in  simulation  studies,  especially  in  the  analysis  of 
large-scale  models.  The  design  situation  of  fewer  runs  than  factors  has 
received  relatively  little  attention  in  the  statistical  literature,  how¬ 
ever.  In  fact,  the  performance  characteristics  of  the  supersaturated 
methods  presently  available  are  largely  unknown. 

In  the  following  subsections  we  describe  four  basic  types  of  designs 
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that  have  been  proposed  for  use  In  supersaturated  situations.  These 
are:  random  balance  (RB)  designs,  systematic  supersaturated  (SS)  de¬ 
signs,  group  screening  (GS)  designs,  and  RB/PB  combination  designs.  Each 
of  these  design  strategies  is  characterized  by  having  an  equal  number 
of  runs  at  the  high  and  low  levels  of  each  factor.  These  designs,  there¬ 
fore,  are  of  resolution  II.  That  is,  main  effects  are  not  confounded 
with  the  overall  mean  effect. 

2.3.1  Random  Balance  Designs 

In  a  two- level  RB  design,  each  column  of  the  design  matrix  consists 
of  N/2  +l's  and  N/2  -l's  where  N  (an  even  number)  denotes  the  total  num¬ 
ber  of  runs  to  be  made.  The  +l's  and  -l's  in  each  column  are  assigned 
randomly,  making  all  possible  combinations  of  N/2  +I's  and  N/2  -l’s 
(there  are  in  equally  likely,  with  each  column  receiving  an 

independent  randomization. 

The  principal  advantage  to  the  RB  method  is  its  flexibility;  the 
sample  size  N  is  fixed  by  the  simulation  analyst  and  can  be  selected  in¬ 
dependently  of  the  number  of  factors,  k,  to  be  screened.  A  second  ad¬ 
vantage  is  the  ease  with  which  we  can  prepare  RB  designs  regardless  of 
the  magnitudes  of  N  and  k. 

There  are  two  main  disadvantages  to  RB  sampling.  The  first  of 
these  is  that  factors  are  confounded  to  a  random  degree.  Thus,  one 
cannot  generally  control  the  amount  of  confounding  or  interdependence 
between  factors.  Secondly,  there  is  no  specific  or  unique  technique  for 
analyzing  RB  designs.  The  simplest  approach  is  to  consider  each  factor 
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separately  and  apply  some  standard  analysis  technique  such  as  a  normal- 
theory  F-test.  More  sophisticated  analysis  methods  include  variable 
selection  procedures  such  as  stagewise  and  least-squares  stepwise  re¬ 
gression  methods.  For  a  more  complete  discussion  of  RB  experimentation 
we  refer  to  Anscombe  (1959),  Budne  (1959),  Satterthwaite  (1959),  and 
Youden,  et.  al.  (1959). 

2.3.2  Systematic  Supersaturated  Designs 

Because  of  the  random  confounding  that  occurs  in  RB  designs.  Booth 
and  Cox  (1962)  introduced  two-level  designs  which  systematically  attempt 
to  minimize  confounding.  Noting  that  not  all  design  columns  can  be  or¬ 
thogonal  when  N  k.  Booth  and  Cox  constructed  designs  that  minimize 

max  | c  . |  where  c  is  the  inner  product  of  design  columns  i  and  j.  Pre- 
l¥j  ij  ij 

sumably,  SS  designs  are  the  best  alternative  to  orthogonal  designs,  which 
are,  of  course,  impossible  to  construct  in  the  supersaturated  case. 

Booth  and  Cox  tabulated  their  designs  for  various  values  of  N  and 
k(k  £  36)  and  outlined,  for  other  combinations  of  N  and  k,  an  iterative 
computer  procedure  for  generating  the  required  designs.  They  admit, 
however,  that  the  cost  of  writing  and  running  the  program  may  be  pro¬ 
hibitive  if  k  is  large.  An  important  concern,  then,  with  SS  designs  is 
their  availability. 

2.3.3  Group  Screening  Designs 

GS  designs  have  been  studied  by  Li  (1962),  Patel  (1962),  and  Watson 
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(1961).  In  a  GS  design  the  individual  factors  are  partitioned  into 
groups  of  suitable  sizes.  The  groups  are  then  tested  by  considering 
each  as  a  single  factor.  Because  the  number  of  groups  is  generally  much 
smaller  than  the  original  number  of  factors,  we  can  usually  study  the 
group  factors  in  a  standard  orthogonal  design  such  as  a  PB  design.  More¬ 
over,  we  can  repeat  the  grouping  and  testing  process  for  any  number  of 
stages.  At  a  given  stage,  however,  we  repartition  only  those  factors 
within  groups  determined  to  have  significant  effects  in  the  preceding 
stage. 

The  level  of  a  group-factor  is  defined  by  assigning  the  group  level 
(e.g.,  +1)  to  all  component  factors.  This,  of  course,  induces  complete 
confounding  of  the  factors  within  a  group,  which  is  the  basic  idea.  At 
each  stage  of  screening  we  can  eliminate  the  individual  factors  from 
those  groups  which  appear  relatively  unimportant. 

The  main  advantage  of  GS  designs  is  that  we  can  to  some  extent  con¬ 
trol  the  confounding  pattern.  There  are  two  corresponding  disadvantages . 
First,  the  number  of  runs  required  by  a  GS  experiment  is  not  fixed  but 
is  random.  Second,  the  possibility  exists  that  effects  may  cancel  with¬ 
in  a  group.  As  a  simple  example,  consider  two  factors  which  have  effects 
that  are  negatives  or  near  negatives  of  each  other.  If  these  two  factors 
are  the  only  important  factors  in  a  group,  their  effects  will  cancel  or 
their  combined  effect  may  be  masked  by  experimental  error.  Mauro  (1983a) 
and  Mauro  and  Smith  (1982)  have  examined  the  cancellation  problem.  Their 
results,  obtained  under  certain  simplifying  assumptions,  tend  to  Indicate 
that  cancellation  does  not  pose  a  major  problem  to  GS. 
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2.3.4  RB/PB  Combination  Designs 


An  RB/PB  screening  plan  is  a  two-stage  strategy  having  an  RB 
first-stage  experiment  followed  by  the  use  of  a  PB  second-stage  ex¬ 
periment.  A  factor  is  included  in  the  second-stage  PB  design  only  if 
it  is  determined  to  have  a  significant  effect  in  the  first  stage. 

Aa  in  GS  designs ,  a  disadvantage  of  RB/PB  designs  is  that  the 
total  number  of  runs  required  is  random  (since  the  number  of  second- 
stage  runs  is  random).  An  advantage  of  these  designs  is  that  the  use 
of  a  PB  experiment  in  the  second  stage  separates  any  confounding  between 
the  factors  that  are  carried  over  from  the  RB  first-stage  experiment. 

2.3.5  Further  Discussion 


Because  of  the  lack  of  comparative  performance  data,  there  are 
currently  no  definitive  guidelines  for  the  selection  and  use  of  super¬ 
saturated  screening  methods.  Nevertheless,  of  the  supersaturated  screening 
methods  presently  available,  the  GS  method  has  been  generally  reconmended. 
Mauro  (1983b),  however,  has  recently  pointed  out  certain  practical  con¬ 
siderations  that  make  group  screening  less  attractive  as  a  technique  for 
factor  screening. 

The  performance  characteristics  of  RB  and  RB/PB  designs  have  been 
studied  by  Kauro  and  Smith  (1984).  They  determined  that  RB/PB  strategies 
perform  better  than  RB  strategies  in  those  situations  where  it  is  impor¬ 
tant  that  Type  1  error  (i.e.,  the  chance  of  classifying  unimportant  fac¬ 
tors  as  Important)  be  maintained  at  a  low  level.  In  compering  SS  with 
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RB  designs.  Booth  and  Cox  (1962)  concluded  that  unless  k  <  2N,  SS 
designs  have  little  advantage  over  RB  designs. 

3.  INVESTIGATING  THE  FUNCTIONAL  RELATIONSHIP 


In  many  cases  the  relationship  between  a  simulation  response  y 
and  the  k  factors  x^,...,x^  can  be  expressed  as 

y-g(Xj . x^)  +  e 

where  g  is  an  unknown  function  and  e  denotes  a  random  error  component. 

We  often  desire  to  know  what  this  relationship  is.  In  other  words, 
we  wish  to  determine  the  functional  form  of  g(x)  where  x*(x^ ,x2, . . . ,x^) . 

In  the  ensuing  discussion,  we  assume  that  both  y  and  the  x^'s  are 
not  only  quantitative,  but  also  continuous.  Furthermore,  we  assume  that 
the  error  component  £  is  normally  distributed  with  mean  0  and  variance 
a2,  where  a2  is  unknown.  Thus,  the  expected  value  of  any  observed  re¬ 
sponse  y  corresponding  to  x  is: 

E(y)  -  g(x>. 

Under  these  assumptions  response  surface  methodology,  or  RSM  for 
short,  proves  valuable.  RSM,  which  is  essentially  a  blending  of  sta¬ 
tistical  experimental  design  and  regression  analysis,  has  its  foundation 
in  a  paper  by  Box  and  Wilson  (1951).  The  terminology  "response  surface" 
derives  from  the  fact  that  the  mean  response  lies  on  a  surface  in  (k+1) 
-dimensional  space. 

In  industry  RSM  has  often  been  applied  to  two  general  problems 
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associated  with  response  surfaces.  These  are: 

(a)  Describing  the  response  surface  in  some  region  of  Interest 

and  (b)  Determining  the  values  of  the  factors  which  produce  the 
optimum  response. 

This  section  addresses  the  former  topic;  the  next  section  discusses  the 
latter  topic.  A  detailed  description  of  RSM  and  its  applications  is 
available  in  Davies  (1978)  or  Myers  (1971). 

lr-n 

Basic  to  RSM  is  the  2  fractional  factorial  which  is  an  exper- 

d  k 

imental  design  consisting  of  a  specific  fraction  (1/2P)  of  the  2  pos- 
sible  points  which  form  a  full  2  factorial  experiment.  In  accordance 
with  our  previous  discussion,  we  assume  that  the  levels  of  each  factor 
in  the  fractional  factorial  are  coded  ±1. 

Under  the  assumption  that  the  function  g  may  be  expressed  In  a 
Taylor  series  expansion  about  a  point  Xq*(Xjq, *  *  * ,xk0^ *  its  value  at  a 
point  x"(xlt . . . ,x^)  is  given  by 


g(x)-g(x0)+I(xi-x  )(|f  )  +^(x rx i0)2(f*2 

1  \  i/x  1  I 3x. 

— o  \  i 


+h  l  (x  -x..)(x  -x.„) 


3JL 


1+i 


i  "iO''j  "jOy  3x,  ax 


+  •  •  • 


i  j  x 


where  the  notation  (•)  indicates  that  the  quantity  in  parentheses  is 

— o 

to  be  evaluated  at  the  point  x^.  It  should  be  noted  that  by  rearranging 
terms,  g  may  be  expressed  as  a  polynomial: 


k  k 


g(x)-80+Z81xi+E  IB^x^x^  +  .  .  . 


(3.1) 
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Depending  upon  the  region  of  interest  to  the  experimenter  (i.e., 

the  simulation  user),  it  may  be  possible  that  a  first-order  polynomial 

provides  a  good  approximation  to  g(x)  within  that  region.  We  can  use 
k—o 

a  2  p  fractional  factorial  of  at  least  resolution  III  to  fit  a  first- 
order  equation.  This  would,  of  course,  yield  the  estimate 

k 

g(x)-b0+Zbixi 


where  b^  is  an  estimate  of  8^ . 

The  b^'s  are  obtained  by  the  method  of  least  squares  through  the 
use  of  the  equation 

WX'X)*lX,£  (3.2) 

where  b  is  a  column  vector  of  the  b^'s,  %  is  a  column  vector  where  the 
entry  consists  of  the  value  of  the  response  corresponding  to  the  j— 
run,  and  X  is  the  matrix  X“[_l ,D] .  Here  ^  denotes  a  column  vector  of  +l's 
and  D  is  the  design  matrix. 

The  estimates  b^  are  uncorrelated  and,  among  all  unbiased  linear 

estimates,  have  minimum  variance.  Although  other  designs  (e.g.,  the 

simplex  designs  studied  by  Box  (1952))  also  provide  uncorrelated,  minimum 

k—p 

variance  estimates,  the  2  v  fractional  factorial  has  the  added  advantage 
of  being  able,  by  the  addition  of  specific  points,  to  evolve  directly  to 
a  second-order  design  which  can  be  used  to  estimate  quadratic  effects 
(the  Bj^'s).  This  proves  valuable  if  it  is  determined  that  a  first-order 
approximation  is  not  adequate. 

Because  simulation  runs  are  usually  at  a  premium,  it  is  a  good  idea 
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to  use  the  smallest  possible  2  p  fractional  factorial  of  resolution 
III.  These  designs  are  easily  obtained;  rules  for  their  generation 
are  given  In  Box  and  Hunter  (1961),  for  example.  The  number  of  runs* 

N,  required  by  these  designs  is,  of  course,  given  by  p  where  p  is 

the  maximum  integer  selected  such  that  2^“p  >  k. 

As  a  check  on  how  well  a  first-order  approximation  fits  the  true 
response  surface,  a  lack-of-fit  test  may  be  conducted.  In  order  to  test 
lack  of  fit,  the  center  point  of  the  fractional  factorial  should  be  run 
in  addition  to  the  N  points  in  the  2k  p  fractional  factorial.  More¬ 
over,  to  obtain  degrees  of  freedom  for  testing  lack  of  fit,  the  center 
point  should  be  replicated,  i.e.,  run  a  number  of  times.  If  this  point 
is  replicated  m  times,  then  there  will  be  m-1  degrees  of  freedom  for 
the  appropriate  error  term  for  testing  lack-of-fit. 

It  can  easily  be  shown  that  for  a  2k  p  fractional  factorial  aug¬ 
mented  with  m  runs  at  the  center  point,  the  estimated  coefficients  b^, 
bj , . . . ,b^  are  given  by 


and 


N  m 

*>  “  +  ^0  r>/<N+B> 
o  L  j  jO,r 


5yjXJi/N 


(i-1 . k) 


where  y. 


denotes  the  observed  response  for  the  J — 1  run  in  the  fractional 
factorial 


yfl  denotes  the  observed  response  for  the  r —  run  at  the  center 
,r  point,  and 

x  ,  denotes  the  (J,i)^  entry  (either  a  +1  or  a  -1)  in  the 
design  matrix. 
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Lack  of  fit  can  be  obtained  and  tested  from  an  analysis  of  vari¬ 
ance  decomposition  of  the  overall  variation  in  the  observed  simulation 
runs  corresponding  to  the  fractional  factorial  points  and  to  the  cen¬ 
ter  points.  The  replicatdd  runs  at  the  center  point  provide  the  "pure" 
error  sum  of  squares  given  by 


SS 


e 


Uy.  -  (  E  y  )/m]/(m-l). 
r-l  °‘r  r-1  °*r 


The  complete  partition  of  the  total  sum  of  squares  and  the  N+m  degrees 
of  freedom  is  given  in  Figure  1. 


It  can  be  shown  that  the  sum  of  squares  due  to  b.,  denoted  by  SS.  , 

1  bi 


is  given  by 


k. 


The  "pure"  quadratic  sum  of  squares  SS  (resulting  from  contributions 

2 

of  terms  of  the  form  x^)  is  given  by 

SSq  -  ®N(yF-yc)2/(N+m), 


where  yf  denotes  the  average  response  over  the  N  points  in  the  frac¬ 
tional  factorial  and  y^  denotes  the  average  response  over  the  m  runs 
at  the  center  point.  We  see  from  Figure  1  that  the  cross-products  sum 
of  squares  S$c,  arising  from  the  terms  of  the  type  8^  (i^j).  is  not 
available  when  k"N- 1 .  If,  however,  k<N-l,  this  term  may  be  obtained 
most  easily  by  subtracting  all  other  sums  of  squares  from  the  total 
sum  of  squares,  SSt»  where 
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Figure  1.  Partition  of  the  Sums  of  Squares  and  the 
Degrees  of  Freedom  in  the  Fractional 
Factorial  and  Center  Points 


20- 


J 


ss. 


^y;  +  £y; 


0,r . 


Under  the  hypothesis  that  a  first-order  fit  is  adequate,  each  of 

the  two  lack-of-fit  terms  SS  and  SS  should  measure  only  random  error. 

q  c 

Therefore,  the  corresponding  mean  squares  MS^  *  SS^  and  MSc  =  SSc/(N-k-l) 

should  be  approximately  the  same  size  as  the  pure  error  mean  square, 

MS  -  SS  / (m-1) . 
e  e 

Lack  of  fit  may  be  judged  by  the  appropriate  F-tests  involving 
the  ratios  MS^/MSe  and  MS^/MS^  These  ratios  may  be  compared  with  the 
upper  a  points  of  F-distributions  with  (l,m-l)  and  (N-k-1,  m-1)  degrees 
of  freedom,  respectively.  The  significance  level  ot  is,  of  course,  se¬ 
lected  by  the  experimenter,  although  O-.05  is  an  old  standby. 

If  no  lack  of  fit  is  indicated  by  either  F-test,  the  fitted  func¬ 
tional  equation  may  be  used,  within  the  factorial  region,  as  a  descrip¬ 
tion  of  the  unknown  function  g(x) .  A  significant  lack  of  fit,  however, 
indicates  that  the  first-order  model  does  not  provide  an  adequate  ex¬ 
planation  of  the  observed  data.  In  this  situation  our  next  step  would 
be  to  take  curvature  of  the  response  surface  into  account  by  fitting  a 
second-order  model  of  the  form: 

,  .  60  *  ♦  e  E6lj*lV 

This  may  be  accomplished  by  adding  the  2k  axial  points  (±Y ,0, . . . ,0) , 
(0,±y,. . . ,0) , . . . , (0,0, . . . ,ty)  to  the  existing  fractional  factorial 
points  and  center  points,  in  order  to  complete  what  is  known  as  a  cen- 


-21- 


J 


tral  composite  design  (CCD) .  This  design  has  a  number  of  excellent 
properties.  For  a  more  detailed  discussion  of  the  CCD,  including  how 
to  choose  the  value  of  y,  see  Myers  (1971). 

As  an  aside,  it  should  be  noted  that  the  decision  to  add  axial 
points  is  not  made  until  after  the  data  resulting  from  the  fractional 
factorial  and  center  points  is  analyzed.  In  many  experimental  situ¬ 
ations  this  would  dictate  the  necessity  for  statistical  blocking  be¬ 
cause  the  two  sets  of  observations  are  not  made  under  homogeneous  con¬ 
ditions.  Fortunately,  in  simulation  experiments  we  need  not  worry  about 
this,  because  the  underlying  conditions  (except,  of  course,  for  any 
generated  random  numbers)  will  not  change. 

4.  OPTIMUM-SEEKING 


Often  the  goal  of  simulation  experimentation  is  not  to  describe 
the  response  surface  in  a  given  region,  but  instead  to  obtain  an  op¬ 
timum  response.  In  other  words,  the  objective  is  to  determine  the 
values  of  x  ■  (x^,...,x^)  that  maximize  (or  minimize)  the  unknown  func¬ 
tion  g(x). 

In  a  sense,  this  type  of  problem-solving  situation  is  similar  to 
an  optimization  problem  to  be  solved  by  mathematical  programming  tech¬ 
niques.  The  major  difference  is  that  no  explicit  objective  function  is 
stated  and,  in  fact,  exists  only  implicitly  in  the  multitude  of  com¬ 
puter  instructions  in  the  programs  comprising  the  simulation.  Thus, 
the  task  of  finding  the  best  solution  cannot  rely  on  those  analytical 
methods  which  are  applicable  when  an  explicit  objective  function  exists. 
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Instead,  a  search  of  the  relevant  factor  space  must  be  made. 

In  many  cases  the  search  for  the  best  simulation  response  is 
conducted  by  an  analyst  who  estimates  factor  values  which  he/she  be¬ 
lieves  correspond  to  a  reasonably  good  solution.  The  analyst  then 
uses  these  values  as  input  to  the  simulation  and  observes  the  cor¬ 
responding  response.  He/she  may  then  postulate  new  factor  values  and 
repeat  the  process  a  number  of  times.  Unfortunately,  the  analyst's 
search  has  a  tendency  to  turn  into  a  trial-and-error  process  involving 
a  large  amount  of  analyst  effort  and  computer  time. 

As  an  alternative,  a  search  algorithm  may  be  used  for  exploring 
the  factor  space.  Smith  (1973)  examined  seven  search  algorithms  and 
concluded  that  an  RSM-based  tended  to  be  the  best  choice.  However, 
it  is  not  without  drawbacks.  For  example,  an  RSM-based  search  may 
yield  a  local  optimum  rather  than  a  global  optimum  if  local  optima 
exist. 

Optimum-seeking  via  RSM  may  involve  up  to  four  phases,  which  are: 

(1)  First-order  design  phase 

(2)  Steepest  ascent  phase 

(3)  Second-order  design  phase 

(4)  Ridge  analysis  phase. 

In  the  first-order  design  phase  we  must  select  the  initial  interval  of 

values  to  be  considered  for  each  factor.  Estimates  of  the  first-order 

effects  within  the  initial  region  defined  by  the  specified  intervals 

k— d 

may  be  obtained  from  a  2  p  fractional  factorial.  Assuming  there  is  no 
lack-of-fit,  the  estimated  coefficients  (bj,...,b^)  indicate  the  direc¬ 
tion  in  which  maximum  improvement  in  the  response  is  predicted.  This 
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direction  is  known  as  "the  path  of  steepest  ascent." 

Simulation  runs  corresponding  to  steps  out  on  this  path  are  then 
made.  These  runs  should  be  made  cautiously  since  the  prediction  be¬ 
comes  less  reliable  as  the  distance  from  the  initial  region  increases. 
(Selection  of  appropriate  step  size  is  more  an  art  than  a  science.  See 
Davies  (1978)  for  an  example.)  When  the  observed  responses  worsen,  the 
process  outlined  in  the  previous  paragraph  is  repeated  unless  lack-of- 
fit  for  the  first-order  model  is  noted.  In  that  event,  the  existing 
fractional  factorial  is  augmented  by  axial  points  to  form  a  CCD.  In 
this  second-order  design  phase  the  resulting  estimated  second-order 
equation  may  then  be  used  to  predict  the  factor  values  which  yield  the 
best  response.  If  these  values  fall  within  the  experimental  region, 
a  corresponding  simulation  run  should  be  made.  Otherwise,  the  best 
direction  in  which  to  proceed  should  be  determined,  with  simulation  runs 
then  conducted  in  that  direction.  This  involves  the  ridge  analysis 
phase.  Ridge  analysis  (Draper  (1963)]  is  the  analogue  of  the  steepest 
ascent  procedure  used  with  the  fitted  first-order  equation. 

5.  APPLICATION  OF  RSM 


In  the  two  previous  sections  we  have  only  briefly  outlined  how 
RSM  may  be  used  in  the  simulation  situation.  The  best  bet  for  the 
simulation  user  who  wishes  an  adequate  background  in  RSM  for  use  either 
in  investigating  the  functional  relationship  in  a  given  region  (Section 
3)  or  in  optimum-seeking  (Section  A)  would  be  to  read  the  statistical, 
rather  than  the  simulation,  literature.  A  thorough  study  of  the  per- 
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tinent  sections  of  Cochran  and  Cox  (1957),  Davies  (1978),  and  Myers 
(1971)  should  provide  the  information  necessary  for  applying  RSM  tech¬ 
niques.  Needless  to  say,  this  implies  a  large  investment  of  time  and 
effort,  an  investment  which  most  people  cannot  afford. 

There  is,  however,  an  excellent  alternative.  That  alternative  is 
to  consult  a  statistician  who  is  versed  in  the  practical  and  theoretical 
aspects  of  experimental  design.  As  an  aside,  it  should  be  noted  that 
because  of  the  independence  of  RSM  from  the  simulation  itself,  it  is 
feasible  to  automate  RSM  application  to  a  large  degree.  In  fact.  Smith 
(1976)  has  developed  a  modular  computer  program,  based  on  RSM,  for 
optimum-seeking  in  the  simulation  situation.  This  FORTRAN  program, 
which  may  be  used  for  constrained  as  well  as  unconstrained  optimum¬ 
seeking,  is  designed  to  function  as  an  executive  program  which  may  be 
interfaced  with  an  existing  FORTRAN-based  simulation.  Application  of 
the  automated  RSM  program  requires  only  minor  modification  of  any  simu¬ 
lation  with  which  it  is  to  be  used.  Although  not  a  panacea,  this  pro¬ 
gram  might  prove  useful. 

6.  VARIANCE  REDUCTION  TECHNIQUES 


In  Section  1  we  mentioned  that  simulation  users  can,  to  a  cer¬ 
tain  degree,  control  the  random  number  streams  used  in  simulation  ex¬ 
periments.  The  basic  idea  of  variance  reduction  techniques  (VRTs)  is 
to  exploit  this  control  in  order  to  increase  the  precision  of  the  simu¬ 
lation  results.  The  following  two  simple  examples  are  often  used  to 
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illustrate  the  potentially  beneficial  effects  of  such  techniques. 

Example  Is  Let  and  X2  denote  the  outputs  of  two  different 
system  variants  in  the  same  simulation  model.  The  statistic 
W“X^-X2  is  an  unbiased  estimator  of  the  difference  between 
the  two  mean  responses  and  has  variance  given  by  VAR(W)  - 
VAR(X1)+VAR(X2)-2COV(X1,X2).  If  we  were  to  use  independent 
streams  of  random  numbers  in  the  two  different  simulations, 
we  would  expect  COV(X^ ,X2>-0.  If,  however,  we  were  deli¬ 
berately  to  use  the  same  stream  of  random  numbers  in  the 
two  situations,  we  would  expect  COV(Xj ,X2)  X).  Thus,  W  would 
have  a  smaller  variance  than  would  occur  with  independent 
streams. 

Example  2;  Let  and  Y2  denote  two  outputs  of  the  same  sys¬ 
tem  variant  in  a  simulation  model.  The  statistic 
is  an  unbiased  estimator  of  the  common  response  mean  and  has 
varaiance  given  by  VAR(Z)«VAR(Y1>/4+VAR(Y2)/4+COV(Y1,Y2)/2. 

If  we  were  deliberately  to  use  random  input  streams  that  were 
negatively  correlated,  we  would  expect  COV(Y^,Y2><0.  Here,  Z 
would  have  a  smaller  variance  than  would  occur  with  independent 
streams. 

The  two  variance  reduction  strategies  illustrated  in  Examples  1 
and  2  are  known  as  common  random  numbers  and  antithetic  variates,  re¬ 
spectively.  These  methods  are  the  two  simplest,  most  straightforward, 
and  most  widely  applied  VRTs.  For  an  excellent  discussion  of  these  two 
techniques  we  refer  the  reader  to  Schruben  (1979)  and  Schruben  and 
Margolin  (1978).  As  noted  by  Schruben  (1979),  variances  are  not  reduced 
uniformly  by  the  use  of  these  techniques  but  are  merely  shifted  from 
more  important  estimators  to  less  important  estimators.  For  instance, 
in  Example  1,  although  VAR(X^-X2)  is  decreased,  VAR(Xj+X2)  is  increased 
by  a  corresponding  amount. 
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In  addition  to  common  random  numbers  and  antithetic  variates, 
other  principal  VRTs  include  importance  sampling,  conditional  expec¬ 
tations,  stratified  sampling,  selective  sampling,  and  control  variates 
(or  regression  sampling).  Detailed  discussions  of  these  techniques 
are  presented  in  Kleijnen  (1974)  and  Law  and  Kelton  (1982).  See  also 
Wilson  (1983).  These  techniques  vary  in  their  complexity  and  appli¬ 
cability.  In  general,  the  use  of  these  VRTs  involves  replacing  or 
modifying  the  original  sampling  procedure,  or  using  the  same  sampling 
process  but  employing  a  more  sophisticated  estimator. 

It  has  been  demonstrated  in  the  literature  that  VRTs  (in  par¬ 
ticular,  common  random  numbers  and  antithetic  variates)  can,  when  ap¬ 
propriately  applied,  significantly  increase  the  statistical  efficiency 
of  the  simulation  results.  For  example,  with  a  judicious  selection 
of  random  number  streams  the  variances  of  an  analyst-specified  subset 
of  b^'s,  the  estimates  of  the  obtained  by  least  squares  via  equa¬ 

tion  (3.2),  can  show  a  marked  decrease  compared  to  variances  resulting 
from  using  independent  input  streams. 

The  use  of  VRTs,  however,  is  not  without  its  drawbacks.  First, 
it  is  not  always  clear  if  the  use  of  a  VRT  will  result  in  a  variance 
reduction;  in  fact,  a  variance  augmentation  may  actually  result.  (See, 
for  Instance,  Kleijnen  (1974)  and  Ramsay  and  Wright  (1979).)  Second, 
analysis  of  the  output  data  is  generally  complicated  by  the  use  of  these 
techniques.  Third,  VRTs  often  result  in  increased  computing  costs  and 
analyst  effort,  which  may  offset  any  potential  gains  in  efficiency. 

In  summary,  we  find  ourselves  somewhat  ambivalent  about  the  prac¬ 
tical  worth  of  VRTs  in  simulation.  Nonetheless,  we  would  not  discourage 
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a  potential  user  of  these  techniques.  We  would,  however,  emphasize 
that  he/she  should  not  only  be  aware  of  their  potential  advantages 
and  disadvantages,  but  also  be  intimately  acquainted  with  both  the 
simulation  and  the  VRTs  under  consideration. 
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